Goal and Average Cost Problems in Decision Processes with Finite State Spaces
Final report, contract F49620-79-C-0123


I. Research Objectives


The objective of the supported research was to continue the
principal investigator's analysis of decision processes with arbitrary
decision sets, with special emphasis on two classical payoff functions.
The first concerns a process with a single fixed goal, in which the objective is
to maximize the probability of reaching the goal and to minimize the
expected time to the goal. A specific objective was to determine whether
one can do as well with stationary strategies as with strategies
which take the whole past into account. The second object of study was
the average reward payoff in finite state decision processes, a specific
objective being to determine whether strategies based only on the
current time and state (Markov strategies) are as good as those which
take the whole past into account.


II.  Status  of  the  Research 




A complete, affirmative answer was found to the first question:
in every dynamic programming or gambling problem with a single fixed goal
and finite state space, there exists a stationary strategy which not only
uniformly (nearly) maximizes the probability of reaching the goal, but
also uniformly (nearly) minimizes the expected time to the goal. The techniques
used in the proof of this result were considerably generalized to answer several
questions about decision processes with arbitrary state spaces and
total-cost criteria. It was shown that in a countable state decision process
with non-negative costs depending on the current state, the action taken,
and the following state, there is always available a Markov strategy which
uniformly (nearly) minimizes the expected total cost. If the costs are
strictly positive and depend only on the current state, there is even a
stationary strategy with the same property.
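As an illustration of the finite-state result above, the following Python sketch enumerates every stationary strategy of a small, made-up goal problem and looks for one that is best at every starting state simultaneously. The states, actions, and transition probabilities in ACTIONS are hypothetical, as are the names evaluate and uniformly_best. The toy has finite action sets, so an exactly optimal strategy exists, whereas the result above covers arbitrary decision sets and asserts only near-optimality. Ranking strategies by probability first and expected time second is an assumption of this sketch, not a statement of the report's precise formulation.

```python
from itertools import product

GOAL, TRAP = "goal", "trap"

# Hypothetical toy problem: ACTIONS[state][action] = {next_state: probability}.
ACTIONS = {
    "a": {
        "slow": {"a": 0.75, GOAL: 0.25},   # reaches the goal surely, mean time 4
        "risky": {GOAL: 0.9, TRAP: 0.1},   # quick, but fails with probability 0.1
        "via_b": {"b": 1.0},               # defer the decision to state b
    },
    "b": {
        "wait": {"b": 0.75, GOAL: 0.25},   # mean time 4 from b
        "jump": {GOAL: 1.0},               # goal in one step
    },
}


def evaluate(strategy, sweeps=500):
    """Reach probability and expected time to the goal under a stationary strategy."""
    reach = {s: 0.0 for s in ACTIONS}
    reach.update({GOAL: 1.0, TRAP: 0.0})
    time = {s: 0.0 for s in ACTIONS}
    time.update({GOAL: 0.0, TRAP: 0.0})
    for _ in range(sweeps):  # fixed-point iteration on the one-step equations
        for s, a in strategy.items():
            law = ACTIONS[s][a]
            reach[s] = sum(q * reach[u] for u, q in law.items())
            time[s] = 1.0 + sum(q * time[u] for u, q in law.items())
    reach = {s: round(reach[s], 9) for s in strategy}
    # The expected time is infinite whenever the goal can be missed.
    time = {s: round(time[s], 6) if reach[s] == 1.0 else float("inf")
            for s in strategy}
    return reach, time


def uniformly_best():
    """Return a stationary strategy that is best at every state, if one exists."""
    states = list(ACTIONS)
    table = []
    for choice in product(*(ACTIONS[s] for s in states)):
        strategy = dict(zip(states, choice))
        table.append((strategy, *evaluate(strategy)))
    # Best achievable pair at each state: highest probability, then shortest time.
    target = {s: min((-r[s], t[s]) for _, r, t in table) for s in states}
    for strategy, r, t in table:
        if all((-r[s], t[s]) == target[s] for s in states):
            return strategy, r, t
    return None


if __name__ == "__main__":
    print(uniformly_best())
```

In this toy problem a single stationary strategy attains the best probability and the best expected time from both non-goal states at once, which is the kind of uniformity the result describes. Brute-force enumeration is used only because the example is tiny; it is not the method of proof.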


Investigations of these results led peripherally to several results
in optimal stopping theory and in classical probability theory. Universal,
best possible constants were found which compare the optimal expected
return of a decision maker with the expected supremum of a sequence of
random variables. For example, it was found that for every sequence of
independent random variables taking only values between zero and one, the
expected value of the supremum of the random variables exceeds the optimal
stop rule expectation by no more than one-fourth. Results
in classical probability stemming from this research include a stronger form
of the Borel-Cantelli Lemma and a very general conditioning principle for
strong laws which conclude that the partial sums converge almost surely.
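The one-fourth bound can be checked numerically on small examples. The Python sketch below, written for a finite sequence of independent discrete random variables with values in [0, 1], computes the optimal stop rule expectation by backward induction and the expected maximum by tracking the distribution of the running maximum. The function names stop_value and expected_max and the two-variable sequence EXAMPLE are illustrative choices, not taken from the papers listed in this report.

```python
from fractions import Fraction as F


def stop_value(dists):
    """Optimal stop rule expectation for independent X_1..X_n (backward induction).

    Each element of dists maps a value of X_k to its probability.
    """
    value = None
    for dist in reversed(dists):
        if value is None:
            # At the last variable the decision maker must stop.
            value = sum(x * p for x, p in dist.items())
        else:
            # Stop at X_k exactly when it beats the value of continuing.
            value = sum(max(x, value) * p for x, p in dist.items())
    return value


def expected_max(dists):
    """Expected value of max(X_1..X_n) for independent discrete variables."""
    running = {F(0): F(1)}  # values lie in [0, 1], so 0 is a safe starting maximum
    for dist in dists:
        updated = {}
        for m, pm in running.items():
            for x, px in dist.items():
                key = max(m, x)
                updated[key] = updated.get(key, 0) + pm * px
        running = updated
    return sum(m * p for m, p in running.items())


# Illustrative sequence: X_1 = 1/2 surely, X_2 = 1 or 0 with equal probability.
EXAMPLE = [
    {F(1, 2): F(1)},
    {F(1): F(1, 2), F(0): F(1, 2)},
]

gap = expected_max(EXAMPLE) - stop_value(EXAMPLE)

if __name__ == "__main__":
    print(stop_value(EXAMPLE), expected_max(EXAMPLE), gap)
```

For this sequence the stop rule expectation is 1/2 and the expected maximum is 3/4, so the gap equals one-fourth exactly. Exact rational arithmetic is used so that the comparison with the bound does not depend on rounding.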


For  the  question  of  existence  of  good  Markov  strategies  in  decision 
processes  with  average  reward  criteria,  various  partial  results  have  been 
obtained,  including  examples  showing  that  the  limit  of  good  strategies 
for  the  discounted  reward  payoff  is  not  necessarily  average-reward  good. 
This  research  is  still  in  progress  and  it  is  hoped  that  a  complete  answer 
to  the  finite  state  case  is  not  far  away. 


AIR FORCE OFFICE OF SCIENTIFIC RESEARCH (AFSC)
NOTICE OF TRANSMITTAL TO DDC
This technical report has been reviewed and is
approved for public release IAW AFR 190-12 (7b).
Distribution is unlimited.
A. D. BLOSE
Technical Information Officer

Approved for public release; distribution unlimited.


III.  List  of  Publications 


1. "On Reaching a Goal Quickly," Technical Report.

2. "Ratio Comparisons of Supremum and Stop Rule Expectations,"
(with R. Kertz), submitted to Z. Wahrscheinlichkeitstheorie.

3. "Decision Processes with Total-Cost Criteria," (with S. Demko),
accepted for publication in Annals of Probability.

4. "Additive Comparisons of Stop Rule and Supremum Expectations of
Uniformly Bounded Independent Random Variables," (with R. Kertz), submitted
to Proceedings of the A.M.S.

5. "A Stronger Form of the Borel-Cantelli Lemma," submitted to
Pacific Journal of Mathematics.

6. "Comparisons of Stop Rule and Supremum Expectations of i.i.d.
Random Variables," (with R. Kertz), submitted to Annals of Probability.

7. "Conditional Generalizations of Strong Laws Which Conclude the
Partial Sums Converge Almost Surely," submitted to Annals of Probability.


IV.  Spoken  Papers 
A.  Presented 

1.  "Betting  Against  a  Prophet,"  seminar  in  Mathematics  Department 
at  University  of  California  at  Berkeley,  August  1979. 

2.  "A  Stronger  Form  of  the  Borel-Cantelli  Lemma,"  seminar  in  the 
Mathematics  Department  at  University  of  California  at  Berkeley,  September 
1979. 


3.  "On  the  Existence  of  Good  Markov  Strategies,"  Colloquium 
Presentation,  Statistics  Department,  University  of  California  at  Berkeley, 
October  1979. 

4. "Markov Decision Processes, Gambling, and Dynamic Programming,"
Colloquium  Presentation,  Mathematics  Department,  University  of  Hawaii, 
April  1980. 

5.  "Conditional  Generalizations  of  Strong  Laws,"  seminar  in  the 
Mathematics  Department,  University  of  California  at  Berkeley,  June  1980. 

6. _ , Probability Seminar,
Mathematics Department, Georgia Institute of Technology, October 1980.

7. "A Stronger Form of the Borel-Cantelli Lemma," Probability Seminar,
Mathematics Department, Georgia Institute of Technology, November 1980.

B.  Scheduled 

1. "Conditional Generalizations of Strong Laws," Annual American
Mathematical  Society  Meeting,  San  Francisco,  January  1981. 

2.  "Finite  State  Decision  Processes,"  Operations  Research  Seminar, 
Georgia  Institute  of  Technology,  February  1981. 


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

4. TITLE (and Subtitle)
GOAL AND AVERAGE COST PROBLEMS IN DECISION PROCESSES WITH FINITE STATE SPACES

7. AUTHOR(s)
Theodore P. Hill

8. CONTRACT OR GRANT NUMBER(s)
F49620-79-C-0123

9. PERFORMING ORGANIZATION NAME AND ADDRESS
School of Mathematics, Georgia Institute of Technology
Atlanta, Georgia 30332

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
61102F 2304 A3

11. CONTROLLING OFFICE NAME AND ADDRESS
Air Force Office of Scientific Research
Air Force Systems Command, USAF
Bolling A.F.B., Washington, D. C. 20332

15. SECURITY CLASS. (of this report)
UNCLASSIFIED

16. DISTRIBUTION STATEMENT (of this Report)
Approved for public release, distribution unlimited.

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
Decision processes, gambling theory, dynamic programming, Markov processes,
stationary strategy

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

The principal investigator carried out research in probability theory and, in
particular, in the theory of decision processes. First, it was shown that in
every finite state decision process (gambling problem) with a single fixed goal,
there always exists a stationary strategy which not only (nearly) maximizes the
probability of reaching the goal, but also (nearly) minimizes the expected time
to the goal. This result was considerably generalized to include decision
processes with arbitrary state spaces and total cost criteria.

Investigations of these processes led to results in optimal stopping theory
and in classical probability theory. Universal, best possible constants were
found which compare the optimal expected return of a decision maker with the
expected supremum of a sequence of independent random variables. A generalization
of the classical Borel-Cantelli Lemma was found, as was a very general
conditioning principle for strong laws of several forms.

The question of existence of good Markov strategies in finite state decision
processes with average reward criteria was addressed, and various partial
results were obtained, although the general case was not settled.


